Skip to content

[asan] Speed up ASan ODR indicator-based checking #100923

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Aug 1, 2024

Conversation

artempyanykh
Copy link
Contributor

@artempyanykh artempyanykh commented Jul 28, 2024

Summary:
When ASan checks for a potential ODR violation on a global it loops over a linked list of all globals to find those with the matching value of an indicator. With the default setting 'detect_odr_violation=1', ASan doesn't report violations on same-size globals but it still has to traverse the list. For larger binaries with a ton of shared libs and globals (and a non-trivial volume of same-sized duplicates) this gets extremely expensive.

This patch adds an indicator indexed (multi-)map of globals to speed up the search.

Note: asan used to use a map to store globals a while ago which was replaced with a list when the codebase moved off of STL.

Internally we see many examples where ODR checking takes seconds (even double digits). With this patch it's practically free and __asan_register_globals doesn't show up prominently in the perf profile anymore.

There are several high-level questions:

  1. I understand that the intent is that we hit the slow path rarely, ideally once before the process dies with an error. But in practice we hit the slow path a lot. It feels reasonable to keep the amount of work bounded even in the worst case, even if it requires a bit of extra memory. But if not, it'd be great to learn about the tradeoffs.
  2. Poisoning based ODR checking remains on the slow path. Internally we build everything with -fsanitize-address-use-odr-indicator so I'm not sure if poisoning-based check would exhibit the same behavior (looking at the code, the shape looks very similar, so it might?).
  3. Globals with an ODR indicator of -1 need to be skipped for the purposes of ODR checking (cf. a257639). But they are still getting added to the list of globals and hence take up space and slow down the iteration over the list of globals. It would be a good saving if we could avoid adding them to the globals list.
  4. Any reason to use a linked list instead of e.g. a vector to store globals?

Test Plan:

  • cmake --build build --target check-asan looks good
  • Perf-wise things look good when linking against this version of compiler-rt.

Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot
Copy link
Member

llvmbot commented Jul 28, 2024

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Artem Pianykh (artempyanykh)

Changes

Summary:
When ASan checks for ODR violation on a global it loops over a linked list of all globals to find those with the matching value of an indicator. For larger binaries with a ton of globals this gets extremely expensive: O(N^2) iterations + potentially poor memory locality from the linked list.

This patch adds an indicator indexed (multi-)map of globals to speed up the search. With a bit of extra memory this massively speeds up indicator based ODR checks.

Poisoning based ODR checking remains on the slow path.

Internally we see many examples where ODR checking takes seconds (even double digits). With this
patch, the same binaries where ODR checking used to take seconds, now it's practically free and
__asan_register_globals doesn't show up prominently in the perf profile anymore.

Notes and considerations:

  1. The map maintains a list of globals with the same indicator and it's not clear if we need the whole
    list or just the first instance of a global. With the default 'halt_on_error=1' just the first one
    would be enough, but with 'halt_on_error=0' we may need all (?)
  2. For poisoning based ODR check we still loop over all globals. We could do a similar map-based
    lookup in a different map here, but given that it's an older impl of ODR checking it's not clear how
    useful/important it'd be to speed it up. We build everything with -fsanitize-address-use-odr-indicator,
    so not terribly important to us, but perhaps a good follow up.
  3. GetGlobalsForAddress needs to loop over all globals. This functionality flows into
    __asan_decribe_address which is part of asan_interface. Perhaps switching from a linked list to a
    vector to store all globals could be a good middle-ground to speed up p.2 and p.3 iterations.
  4. Globals with an ODR indicator of -1 need to be skipped for the purposes of ODR
    checking (cf. a257639). But they are still getting added to the list of globals.
    a. We often see that the majority of inserted globals have an ODR indicator of -1
    and hence take up space and slow down the iteration over the list of globals.
    If we could skip them it would be a good saving, but given p.3 above it's not clear if we can.
    b. In the context of this patch, it means that the effect of introducing an indicator based map
    is two-fold: firstly, we're now dealing with only a smaller number of globals that should be
    checked for ODR, and secondly we can do a lookup in constant time rather than having to do a linear scan.

Test Plan:

  • cmake --build build --target check-asan looks good
  • Perf-wise things look good when linking against this version of compiler-rt.

Full diff: https://github.com/llvm/llvm-project/pull/100923.diff

1 Files Affected:

  • (modified) compiler-rt/lib/asan/asan_globals.cpp (+34-9)
diff --git a/compiler-rt/lib/asan/asan_globals.cpp b/compiler-rt/lib/asan/asan_globals.cpp
index d413b1ebc9fc0..26518c8ce4b98 100644
--- a/compiler-rt/lib/asan/asan_globals.cpp
+++ b/compiler-rt/lib/asan/asan_globals.cpp
@@ -21,6 +21,7 @@
 #include "asan_suppressions.h"
 #include "asan_thread.h"
 #include "sanitizer_common/sanitizer_common.h"
+#include "sanitizer_common/sanitizer_dense_map.h"
 #include "sanitizer_common/sanitizer_mutex.h"
 #include "sanitizer_common/sanitizer_placement_new.h"
 #include "sanitizer_common/sanitizer_stackdepot.h"
@@ -35,8 +36,11 @@ struct ListOfGlobals {
   ListOfGlobals *next;
 };
 
+typedef DenseMap<uptr, ListOfGlobals *> MapOfGlobals;
+
 static Mutex mu_for_globals;
 static ListOfGlobals *list_of_all_globals;
+static MapOfGlobals map_of_globals_by_indicator;
 
 static const int kDynamicInitGlobalsInitialCapacity = 512;
 struct DynInitGlobal {
@@ -73,6 +77,18 @@ ALWAYS_INLINE void PoisonRedZones(const Global &g) {
 
 const uptr kMinimalDistanceFromAnotherGlobal = 64;
 
+static void AddGlobalToList(ListOfGlobals *&list, const Global *g) {
+  ListOfGlobals *l = new (GetGlobalLowLevelAllocator()) ListOfGlobals;
+  l->g = g;
+  l->next = list;
+  list = l;
+}
+
+static void AddGlobalToMap(MapOfGlobals &map, const Global *g) {
+  ListOfGlobals *&in_map = map[g->odr_indicator];
+  AddGlobalToList(in_map, g);
+}
+
 static bool IsAddressNearGlobal(uptr addr, const __asan_global &g) {
   if (addr <= g.beg - kMinimalDistanceFromAnotherGlobal) return false;
   if (addr >= g.beg + g.size_with_redzone) return false;
@@ -147,14 +163,20 @@ static void CheckODRViolationViaIndicator(const Global *g) {
     *odr_indicator = REGISTERED;
     return;
   }
+  // Fetch globals with the same ODR indicator
+  auto *relevant_globals_lookup =
+      map_of_globals_by_indicator.find(g->odr_indicator);
+  if (!relevant_globals_lookup) {
+    return;
+  }
+  ListOfGlobals *relevant_globals = relevant_globals_lookup->second;
   // If *odr_indicator is DEFINED, some module have already registered
   // externally visible symbol with the same name. This is an ODR violation.
-  for (ListOfGlobals *l = list_of_all_globals; l; l = l->next) {
-    if (g->odr_indicator == l->g->odr_indicator &&
-        (flags()->detect_odr_violation >= 2 || g->size != l->g->size) &&
+  for (ListOfGlobals *l = relevant_globals; l; l = l->next) {
+    if ((flags()->detect_odr_violation >= 2 || g->size != l->g->size) &&
         !IsODRViolationSuppressed(g->name))
-      ReportODRViolation(g, FindRegistrationSite(g),
-                         l->g, FindRegistrationSite(l->g));
+      ReportODRViolation(g, FindRegistrationSite(g), l->g,
+                         FindRegistrationSite(l->g));
   }
 }
 
@@ -225,10 +247,13 @@ static void RegisterGlobal(const Global *g) {
   }
   if (CanPoisonMemory())
     PoisonRedZones(*g);
-  ListOfGlobals *l = new (GetGlobalLowLevelAllocator()) ListOfGlobals;
-  l->g = g;
-  l->next = list_of_all_globals;
-  list_of_all_globals = l;
+
+  AddGlobalToList(list_of_all_globals, g);
+
+  if (UseODRIndicator(g) && g->odr_indicator != UINTPTR_MAX) {
+    AddGlobalToMap(map_of_globals_by_indicator, g);
+  }
+
   if (g->has_dynamic_init) {
     if (!dynamic_init_globals) {
       dynamic_init_globals = new (GetGlobalLowLevelAllocator()) VectorOfGlobals;

@artempyanykh artempyanykh marked this pull request as draft July 29, 2024 08:52
@artempyanykh artempyanykh marked this pull request as ready for review July 29, 2024 09:58
@vitalybuka vitalybuka self-requested a review July 30, 2024 00:35
@vitalybuka
Copy link
Collaborator

Thanks for the patch!
We aware of performance issues in this check, but had not time to make it a priority.

@artempyanykh
Copy link
Contributor Author

Thanks for the context @vitalybuka! This has become a bit of a bottleneck for us, hence this patch.
And thanks for review comments @MaskRay!

Looking forward to hear what you think.

vitalybuka and others added 8 commits August 1, 2024 09:49
When ASan checks for ODR violation on a global it loops over a linked list of all globals to find
those with the matching value of an indicator. For larger binaries with a ton of globals this gets
extremely expensive: O(N^2) iterations + potentially poor memory locality from the linked list.

This patch adds an indicator indexed (multi-)map of globals to speed up the search. With a bit of
extra memory this massively speeds up indicator based ODR checks.

Poisoning based ODR checking remains on the slow path.
[asan] Speed up ASan ODR indicator-based checking

When ASan checks for a potential ODR violation on a global it loops over
a linked list of all globals to find those with the matching value of an
indicator. With the default setting 'detect_odr_violation=1', ASan
doesn't report violations on same-size globals but it still has to
traverse the list.

For larger binaries with a ton of shared libs and globals (and a
non-trivial volume of same-sized duplicates) this gets extremely
expensive.

This patch adds an indicator indexed (multi-)map of globals to speed up
the search.
… get unregistered

The test demonstrates a FN due to indicator value being shared among all globals with the same
indicator address.

Moving dlopen/dlclose around and/or adding more shared libs into the mix can show more edge cases
but this is out of the scope of this commit.
Copy link
Collaborator

@vitalybuka vitalybuka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Will land in 30min or so.

@artempyanykh
Copy link
Contributor Author

Thanks for the review @vitalybuka!

@vitalybuka vitalybuka merged commit c584c42 into llvm:main Aug 1, 2024
6 checks passed
Copy link

github-actions bot commented Aug 1, 2024

@artempyanykh Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested
by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as
the builds can include changes from many authors. It is not uncommon for your
change to be included in a build that fails due to someone else's changes, or
infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself.
This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

@vitalybuka
Copy link
Collaborator

BTW.
My concern is slowness of
__asan_after_dynamic_init/__asan_before_dynamic_init
They are 10x slower then unpatched register.
However it's not enabled by default, so maybe it's unused in your env.

I have patches for that and will CC you.

vitalybuka added a commit that referenced this pull request Aug 2, 2024
vitalybuka added a commit that referenced this pull request Aug 2, 2024
Will investigate soon.
The test is from #100923.
@artempyanykh
Copy link
Contributor Author

Not sure I've seen those prominently in profiles. Will double check. Thanks for the heads up @vitalybuka!

@artempyanykh
Copy link
Contributor Author

Just for my own curiosity, but @vitalybuka where/how DenseMap's destructor was causing problems? 54c9404

@vitalybuka
Copy link
Collaborator

vitalybuka commented Aug 2, 2024

Just for my own curiosity, but @vitalybuka where/how DenseMap's destructor was causing problems? 54c9404

I don't have a particular case, we avoid global destructors in general.

@vitalybuka
Copy link
Collaborator

and constructors

@artempyanykh artempyanykh deleted the fast-odr-indicator branch August 21, 2024 16:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants